Community Evolution of Social Network: 
Feature, Algorithm and Model 



Yi Wang, Bin Wu, and Nan Du 

Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, 
Beijing University of Posts and Telecommunications, Beijing 
wangyi.tseg@gmail.com
{wubin, dmian}@bupt.edu.cn



Abstract. Researchers have long explored the static features of social
networks and discovered many representative characteristics, such as the
power law in the degree distribution and the assortative value used to
differentiate social networks from nonsocial ones. However, these results
say little about dynamics, and more and more attention has been paid to
uncovering the dynamic characteristics of social networks, especially to
tracking community evolution effectively. With these interests, in this
paper we first present some basic dynamic features of social networks.
On that basis, we propose a novel core-based algorithm for tracking
community evolution, CommTracker, which depends on core nodes to establish
the evolving relationships among communities at different snapshots. With
the algorithm, we discover two unique phenomena in social networks and
propose two representative coefficients, GROWTH and METABOLISM, by which
we are also able to distinguish social networks from nonsocial ones from
the dynamic aspect. Finally, we develop a social network model that is
capable of exhibiting the two features above.



1 Introduction. 

Social network analysis has been a hot topic in the field of data mining. In a
co-authorship network, a node is an author and an edge indicates a publishing
collaboration between two authors. Researchers are interested in these special
networks, in which they discover a power law in the degree distribution, that
is, only a small proportion of nodes have high degree while the rest have low
degree. Social networks also present positive assortative values, while in
nonsocial networks, such as the Internet and biological networks, the values
are always negative. This indicates that in social networks higher degree nodes
tend to connect with higher degree nodes, while in nonsocial ones higher degree
nodes are more likely to be linked with lower degree ones. Moreover, researchers
reveal community structures, where vertices within a community have a higher
density of edges while vertices between communities have a lower density of
edges. In the co-authorship network, a community reflects a group of scholars
with similar interests. From these static characteristics, people have gained
much understanding of social networks. However, we are not satisfied with these
achievements and go on to explore the dynamic features of social networks. For
example, how can we track community evolution effectively? Do other dynamic
features exist that distinguish social networks from nonsocial ones? How can we
establish a more reasonable model of social networks?

This work is supported by the National Science Foundation of China under grant
number 60402011, and the National Science and Technology Support Program of
China under Grant No. 2006BAH03B05.

Motivated by the dynamic features of social networks, we first perform
experiments in which a long time span is divided into several snapshots, and
we find that about 80 percent of nodes appear in only one or two snapshots.
This indicates that most nodes are so unstable that we cannot rely on them too
much. We also discover that nodes with higher degree appear in more snapshots.
On this basis, we propose a core-based algorithm called CommTracker to track
community evolution effectively. With it, we not only find community evolution
traces but also discover split and mergence points within a trace. Using the
algorithm, we find two unique phenomena of social networks. One is that a
larger community tends to live longer, and the other is that a community with
a longer life tends to have lower member stability. Correspondingly, we propose
two representative coefficients, GROWTH and METABOLISM, by which we are able to
tell social networks from nonsocial ones. Finally, we propose a more reasonable
model that focuses on node change. The model successfully displays the two
phenomena discovered above.

We validate our conclusions on 11 datasets, including 6 social networks (3
co-authorship networks in the cond-mat, math and nonlinear fields, a call
network, an email network and a movie actor network) and 5 nonsocial ones (3
software networks (tomcat 4, tomcat 5, ant), an Internet network and a
vocabulary network).

The rest of the paper is organized as follows: Section 2 reviews the related 
work. Section 3 gives definitions. Section 4 introduces some basic dynamic fea- 
tures of our dataset. Section 5 presents the core-based algorithm of tracking 
community evolution. Section 6 introduces two unique phenomena discovered in 
the social networks. Section 7 shows our model and Section 8 concludes. 

2 Related Work. 

A lot of work has been dedicated to exploring the characteristics of social
networks. Barabasi and Albert show an uneven distribution of degree through the
BA model [1]. Newman has successfully discovered distinct characteristics
between social networks and nonsocial ones [20]. Various methods have been
utilized to detect community structures. Among them are Newman's betweenness
algorithm [HIIIl], Nan Du's clique-based algorithm [T^ and CPM [IT], which
focuses on finding overlapping communities. Clustering is another technique to
group similar nodes into large communities, including L. Donetti and M. Miguel's
method [5], which exploits spectral properties of the graph as well as the
Laplacian matrix, and J. Hopcroft's "natural community" approach [10]. Some
social network models have been proposed [H] [52] [23].

With respect to core node detection, Roger Guimera and Luis A. Nunes Amaral
propose a methodology that classifies nodes into universal roles according to
their pattern of intra- and inter-module connections [4]. B. Wu offers a method
to detect core nodes with a threshold [3]. Shaojie Qiao and Qihong Liu dedicate
themselves to mining core members of a crime community [19].

As to dynamic graph mining, Tanya Y. Berger-Wolf and Jared Saia study
community evolution based on node overlapping [6]; John Hopcroft and Omar
Khan propose a method which utilizes "natural communities" to track evolution
[5]. However, both methods require some parameters to be set, which makes them
hard to adapt to various situations. In contrast, Keogh et al. suggest the
notion of parameter-free data mining [TSj. Jimeng Sun's GraphScope is a
parameter-free method for mining large time-evolving graphs [Hj, using
information-theoretic principles. Our method in the paper shares the same
spirit.

As forerunners, A.-L. Barabasi and H. Jeong study the variation of static
characteristics on the network of scientific collaboration [TS]. Gergely Palla
and A.-L. Barabasi provide a method which effectively utilizes edge overlapping
to build evolving relationships [7]. With this approach, they discover valuable
phenomena of social community evolution.



3 Symbol Definition. 

The table below lists almost all the symbols used in the paper.

Sym.                                   Definition
$C_i^{(t)}$                            Community of index i in snapshot t
$N_i^{(t)}$                            Node of index i in snapshot t
$W(N_i^{(t)})$                         Weight of the node of index i in snapshot t
$Cen(N_i^{(t)})$                       Central degree of node $N_i^{(t)}$
$Core(C_i^{(t)})$                      Core node set of $C_i^{(t)}$
$Node(C_i^{(t)})$                      Node set of $C_i^{(t)}$
$Edge(C_i^{(t)})$                      Edge set of $C_i^{(t)}$
$|Node(C)|$                            Size of community C
$C_i^{(t)} \to C_j^{(t+1)}$            $C_i^{(t)}$ is a predecessor of $C_j^{(t+1)}$, or $C_j^{(t+1)}$ is a successor of $C_i^{(t)}$
$C_i^{(t-k)} \Rightarrow C_j^{(t)}$    $C_i^{(t-k)}$ is an ancestor of $C_j^{(t)}$
$Evol(C_i^{(t)})$                      Evolution trace of $C_i^{(t)}$
$|Evol(C_i^{(t)})|$                    Span of the evolution trace of $C_i^{(t)}$

Definition 1. (COMMUNITY EVOLUTION TRACE).

An evolution trace $Evol(C_x^{(t)})$ is a time-series of communities as follows:

$Evol(C_x^{(t)}) := \{C_x^{(t)}, C_x^{(t+1)}, \ldots, C_x^{(t+n)}\} \quad (n \ge 0)$

where each community $C_x^{(t+i)}$, $i \in [1, n]$, satisfies the condition that
there exists at least one community $C_x^{(t+i-1)}$ such that
$C_x^{(t+i-1)} \to C_x^{(t+i)}$. Note that more than one community is allowed to
appear in the same snapshot t+i, like two communities $C_x^{(t+1)}$ and
$C_y^{(t+1)}$ both located in snapshot t+1. The span $|Evol(C_x^{(t)})|$ is n + 1.

Definition 2. (ANCESTOR OF A COMMUNITY).

The definition of a community's ancestor is as follows: $C_i^{(t-k)} \Rightarrow C_j^{(t)}$
if there is an evolving chain
$C_i^{(t-k)} \to C_x^{(t-k+1)} \to \ldots \to C_j^{(t)}$ $(k \ge 1)$.

Definition 3. (COMMUNITY AGE).

The age of a community is the time span between its birth snapshot and its
current snapshot. For example, in the $Evol(C_x^{(t)})$ of Definition 1, the age
of $C_x^{(t)}$ is 1 and the age of $C_x^{(t+2)}$ is 3.

Definition 4. (MEMBER STABILITY OF A COMMUNITY).
The member stability of a community $C^{(t)}$ is as follows:

$MS(C^{(t)}) = \frac{|Node(C^{(t)}) \cap (Node(C_1^{(t+1)}) \cup Node(C_2^{(t+1)}) \cup \ldots \cup Node(C_n^{(t+1)}))|}{|Node(C^{(t)}) \cup (Node(C_1^{(t+1)}) \cup Node(C_2^{(t+1)}) \cup \ldots \cup Node(C_n^{(t+1)}))|}$

where $C^{(t)} \to C_i^{(t+1)}$ $(i \in [1, n])$.

Definition 5. (MEMBER STABILITY OF A COMMUNITY EVOLUTION TRACE).

The member stability of a community evolution trace is the average stability
value of all communities having successors within the trace. It is defined as
$\sum MS(C^{(t)}) / n$, where $C^{(t)}$ ranges over the communities having
successors and n is the corresponding number.
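
As an illustration of Definitions 4 and 5, the following is a minimal Python
sketch. It assumes communities are given as plain sets of node identifiers; the
function names `member_stability` and `trace_stability` are hypothetical and do
not come from the paper.

```python
from typing import Iterable, List, Set, Tuple


def member_stability(nodes_t: Set[str], successors: Iterable[Set[str]]) -> float:
    """Definition 4: |A n U| / |A u U|, where A is the node set of the
    community and U is the union of the node sets of its successors."""
    union_next: Set[str] = set().union(*successors)
    denominator = nodes_t | union_next
    if not denominator:
        return 0.0  # assumption: an empty community has stability 0
    return len(nodes_t & union_next) / len(denominator)


def trace_stability(trace: List[Tuple[Set[str], List[Set[str]]]]) -> float:
    """Definition 5: average stability over the communities of a trace
    that have at least one successor.

    `trace` is a list of (node_set, list_of_successor_node_sets) pairs.
    """
    values = [member_stability(nodes, succ) for nodes, succ in trace if succ]
    return sum(values) / len(values) if values else 0.0
```

For example, a community {a, b, c} that splits into {a, b, d} and {c, e} keeps
all three of its members among the five nodes involved, giving a stability of
3/5.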



4 Basic dynamic characteristics of social networks. 

In this section, we are interested in the following three aspects: (1) how the
scale of social networks evolves; (2) how the members of social networks
evolve; (3) which nodes tend to live long lives.

Note that the paper concentrates on social networks, but nonsocial networks
are also taken into account because we must compare the distinct
characteristics of the two.



4.1 Dataset.

Co-authorship networks in the fields of condensed matter, math and nonlinear.
Here, nodes represent authors and edges are collaboration relationships in
publishing papers. These three datasets include co-authorship information from
the Cornell e-print archive (http://arxiv.org) from 1993 to 2006, from 1993 to
2006 and from 1994 to 2006 respectively, and we build 28, 28 and 26 network
snapshots from them by taking each half year of data as a snapshot.

Cell phone network. In the network, a caller or callee is a node and the phone
communication between them is an edge. The dataset includes call information
over 20 weeks in a province of China, and we obtain 10 network snapshots, each
covering 2 weeks of call information.

Email network. Here, a node is an email sender or receiver and an edge is one
email communication. This dataset from Enron (http://www.cs.cmu.edu/enron/)
spans about 3 years, and 32 network snapshots are obtained, each with a
duration of 1 month.

Collaboration network of movie actors. Nodes are movie actors and edges
represent their collaborations. The dataset includes collaboration information
from 1980 to 2002 (http://www.imdb.com). Each snapshot covers 2 years.

Internet network. From this dataset (http://sk_aslinks.caida.org), we get 29
snapshots of the Internet, one every 2 months.

Vocabulary network. We get vocabulary related to computers from EI Village
from 1993 to 2006 (http://www.engineeringvillage2.org.cn). A node is a
controlled term, and if two controlled terms appear in the same article, an
edge exists between them. In this case, a snapshot lasts one year.

Software networks of Ant, Tomcat4 and Tomcat5. Here, a node represents a class
and an edge exists between two classes if they have an invoking relationship.
The three datasets include 12, 19 and 21 versions respectively
(http://www.apache.org), and each version is used to establish one network.



4.2 The evolution of network scale. 

As Fig. 1 shows, in each co-authorship network (cond-mat, math, nonlinear),
the number of nodes gradually increases from snapshot to snapshot. The
phenomenon is also observed in the movie actor network. However, in the call
network such an increasing trend is not very apparent, and in the email network
we see a fluctuating rise that falls off in the latest snapshots. In our
analysis, the co-authorship and movie datasets reflect worldwide cooperation,
which is relatively complete. In contrast, the call network only covers one
province and the email network comes from a single company, Enron, so both
might reflect only partial change. Overall, we conclude that the scale of a
social network inflates as it evolves.



4.3 The evolution of social network members. 

Although the size of a network increases during the evolution, its members are
always changing, that is, some members leave the network and some enter it.
Our statistics indicate that during the whole evolution process about 80% of
nodes appear in no more than two snapshots (see Fig. 2). Therefore, we conclude
that the members of social networks change dramatically and only a small
proportion stays in the network stably.
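
The statistic behind this observation can be sketched as follows. This is a
minimal illustration, assuming each snapshot is available as a set of node
identifiers; the function name `appearance_distribution` is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set


def appearance_distribution(snapshots: List[Set[str]]) -> Dict[int, float]:
    """Map k to the fraction of nodes that appear in exactly k snapshots."""
    # Number of snapshots in which each node appears.
    times = Counter(node for snapshot in snapshots for node in snapshot)
    # Number of nodes for each appearance count.
    histogram = Counter(times.values())
    total = len(times)
    return {k: histogram[k] / total for k in sorted(histogram)}


snapshots = [{"a", "b", "c"}, {"a", "d"}, {"a", "b", "e"}]
dist = appearance_distribution(snapshots)
# Share of nodes appearing in at most two snapshots.
share_at_most_two = sum(frac for k, frac in dist.items() if k <= 2)
```

In this toy input, four of the five nodes appear in at most two snapshots, so
`share_at_most_two` is 0.8.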



4.4 Discovery of long life members. 

We are also interested in which nodes appear in the network most often. Here,
node degree is taken as a critical factor, since it indicates the importance of
a node in the network to some extent. We calculate the correlation coefficient
between node degree and appearance frequency in each of the six social
networks: cond-mat is 0.12; math is 0.13; nonlinear is 0.22; call is 0.28;
email is 0.44; movie actor is 0.14. All six values are positive, so nodes with
higher degree are more likely to stay in the network.
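
The text does not state which correlation coefficient is used or how a node's
degree is aggregated over snapshots. The sketch below assumes the Pearson
coefficient and one degree value per node (for instance its average degree);
both choices, and the name `pearson`, are assumptions made for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-node data: degree and number of snapshots the node appears in.
degrees = [1, 2, 2, 5, 9]
appearances = [1, 1, 2, 3, 6]
r = pearson(degrees, appearances)  # positive when high-degree nodes persist
```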

According to the conclusions in this section, a large proportion of nodes is
so unstable that we cannot rely on them too much; instead, we should focus on
the small set of stable nodes, especially when we want to track community
evolution.




Fig. 1. Network scale (node number) evolution. Snapshot id (X axis) and net- 
work scale (Y axis) 



5 Core-based algorithm of tracking community evolution. 

As discussed above, community structures can be mined by many algorithms in
every network snapshot. We are interested in how these communities evolve.
For example, given a community in snapshot t, what is its state in the next
snapshot t+1? Does it split into smaller ones, or merge into a larger one with
another community?

Fig. 2. Node appearance distribution. Node appearance times (X axis) and
percentage (Y axis)

Our algorithm, CommTracker, relies heavily on core nodes instead of the
overlapping level of nodes or edges between two communities. From the
experiments above, we have seen that most nodes lack stability. Therefore,
tracking community evolution with the representative and reliable core nodes,
rather than with all nodes including the highly fluctuating ones, will be more
accurate and effective. A good example is the co-authorship community, where
core nodes represent famous professors and ordinary ones are their students.
The research interest of the professors is usually that of the whole community.
Moreover, professors are less likely to change their research interest than
ordinary students.

In this section, the algorithm of core node detection is firstly introduced and 
then we present our core-based algorithm of tracking community evolution. 

5.1 Core Node Detection Algorithm. 

As discussed above, core nodes are of the greatest importance in our evolution
algorithm, so the preparatory work of selecting core nodes from a community is
a key step. The structure of a community is too dynamic and unpredictable to
set an empirical threshold that distinguishes core nodes from ordinary ones.
Unlike [3], the following method aims to be not only effective but also
parameter-free.

A node can be weighted in terms of many aspects, such as degree, betweenness,
PageRank and so on. Generally, the higher a node's weight is, the more
important it is in a community. Here, we give a node $N_i$ a weight value
$W(N_i)$ according to its degree.

In our algorithm, both the community topology and the node weight are
considered critical factors in distinguishing core nodes from ordinary ones.
The whole algorithm is presented in Algorithm 1.




Fig. 3. Core detection illustration. 

The basic idea behind the algorithm is similar to a voting strategy. Each node
$N_i$ is entitled to evaluate the centrality of the nodes linked with it. If
$W(N_i)$ is higher than the weight $W(N_j)$ of a linked node, then $N_i$ is
considered more important than $N_j$, so $N_i$'s centrality value is
incremented by a specified value while $N_j$'s value is reduced by the same
value. Here, $|W(N_i) - W(N_j)|$ is employed to represent the centrality
difference between the two nodes. After the "vote" of all surrounding nodes, if
$N_i$'s centrality is nonnegative, it is regarded as a core node. Otherwise, it
is just an ordinary node.

As Fig. 3 shows, $W(N_1) = 6$. The running result is that $Cen(N_1) = 23$,
$Cen(N_2) = 12$, whereas $Cen(N_3) = Cen(N_4) = -5$,
$Cen(N_6) = Cen(N_7) = Cen(N_{10}) = -4$, $Cen(N_8) = Cen(N_9) = -3$ and
$Cen(N_5) = -7$. Therefore, the core set is $\{N_1, N_2\}$.

In general, Algorithm 1 is effective for detecting core nodes in a small
network scope, like a community, where node distances are no more than 3 hops
and each node has a large probability of connecting to all other ones.

5.2 Core-based Algorithm of Tracking Community Evolution. 

Tanya Y. Berger-Wolf and Jared Saia propose a method based on the overlapping
level of nodes, in which $C^{(t+1)}$ is a successor of $C^{(t)}$ if
$nodeoverlap(C^{(t)}, C^{(t+1)}) > s$ [6]. However, setting a proper s is
challenging for users. When the members of a community change dramatically and
s is given a high value, the successor $C^{(t+1)}$ will be considered to have
disappeared because its overlapping level with $C^{(t)}$ is too low, although
in fact $C^{(t+1)}$ still exists. Conversely, if s is set a bit low, irrelevant
communities get more opportunities to become successors of $C^{(t)}$, leading
to a "successor explosion" that masks the real successors.



Algorithm 1 CoreDetection(C)

if $W(N_1) = W(N_2) = \ldots = W(N_n)$ then
    return C
end if
$Cen(N_i) = 0$, $i \in [1, n]$
for every edge $e \in Edge(C)$ do
    $N_i$, $N_j$ are the nodes connected by e
    if $W(N_i) < W(N_j)$ then
        $Cen(N_i) = Cen(N_i) - |W(N_i) - W(N_j)|$
        $Cen(N_j) = Cen(N_j) + |W(N_i) - W(N_j)|$
    end if
    if $W(N_i) > W(N_j)$ then
        $Cen(N_i) = Cen(N_i) + |W(N_i) - W(N_j)|$
        $Cen(N_j) = Cen(N_j) - |W(N_i) - W(N_j)|$
    end if
end for
coreset = {}
for every node $N_i \in Node(C)$ do
    if $Cen(N_i) \ge 0$ then
        put $N_i$ into coreset
    end if
end for
return coreset
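
The voting procedure of Algorithm 1 can be sketched in Python as follows. This
is an illustrative sketch rather than the authors' implementation: it assumes a
community is given as a node collection plus a list of undirected edges among
those nodes, and that the weight defaults to the degree inside the community.
The name `core_detection` and its signature are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple

Node = Hashable
Edge = Tuple[Node, Node]


def core_detection(
    nodes: Iterable[Node],
    edges: Iterable[Edge],
    weight: Optional[Dict[Node, float]] = None,
) -> Set[Node]:
    """Return the nodes whose centrality after edge-wise voting is >= 0."""
    node_set = set(nodes)
    edge_list = list(edges)
    if weight is None:
        # Assumption: weight is the degree counted inside the community.
        weight = {n: 0 for n in node_set}
        for u, v in edge_list:
            weight[u] += 1
            weight[v] += 1
    # If all weights are equal, the whole community is returned.
    if len({weight[n] for n in node_set}) <= 1:
        return node_set
    centrality = {n: 0 for n in node_set}
    for u, v in edge_list:
        diff = abs(weight[u] - weight[v])
        if weight[u] < weight[v]:
            centrality[u] -= diff
            centrality[v] += diff
        elif weight[u] > weight[v]:
            centrality[u] += diff
            centrality[v] -= diff
    return {n for n in node_set if centrality[n] >= 0}


# A star with centre "c": the centre collects every vote.
star_core = core_detection("abcd", [("c", "a"), ("c", "b"), ("c", "d")])
```

Because each edge adds and subtracts the same amount, the centralities always
sum to zero, which is consistent with the values reported for Fig. 3.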



Gergely Palla and A.-L. Barabasi provide an approach utilizing the overlapping
of edges between two communities [7], but it fails to deal with split and
mergence among communities. Suppose there are one community $C^{(t)}$ in
snapshot t and two communities $C_i^{(t+1)}$, $C_j^{(t+1)}$ in snapshot t+1. If
the edge overlapping level between $C^{(t)}$ and $C_i^{(t+1)}$ is higher than
that between $C^{(t)}$ and $C_j^{(t+1)}$, $C_i^{(t+1)}$ becomes the successor of
$C^{(t)}$ while $C_j^{(t+1)}$ is considered a newborn community. Actually,
$C^{(t)}$ may have split into two parts. A similar problem also exists in the
process of community mergence.

The disadvantage of the methods above is that they treat all nodes equally,
which does not accord with reality, where different nodes have different
influence. Our method pays close attention to this difference and therefore
puts the emphasis on core nodes.

The basic thought of our algorithm can be described as follows:

$C_i^{(t)} \to C_j^{(t+1)}$ if and only if (1) at least one core node of
$C_i^{(t)}$ appears in $C_j^{(t+1)}$, that is,
$Core(C_i^{(t)}) \cap Node(C_j^{(t+1)}) \neq \emptyset$; and (2) at least one
core node of $C_j^{(t+1)}$ appears in some ancestor community of $C_i^{(t)}$,
that is, there exists one $C^{(t-m)} \Rightarrow C_i^{(t)}$ with
$Node(C^{(t-m)}) \cap Core(C_j^{(t+1)}) \neq \emptyset$ (see Fig. 4).

For the first condition, it is reasonable to expect $C_i^{(t)}$'s core nodes to
appear in some succeeding community $C_j^{(t+1)}$, due to the representative
quality of core nodes. As to the second condition, if some community
$C_j^{(t+1)}$ is to become the successor of a specified community $C_i^{(t)}$,
its core nodes must appear in some ancestor of $C_i^{(t)}$, because of the
stable quality of core nodes, that is, core nodes do not appear suddenly
without any evidence in the past snapshots.

Fig. 4. Community evolution illustration: core nodes are colored red and
ordinary ones grey. As we see, (1) in snapshot t+1, $C^{(t+1)}$ contains two
core nodes $N_1$, $N_2$ of $C^{(t)}$. (2) Node $N_9$ has also been in
$C^{(t-m)}$, an ancestor of $C^{(t)}$. Therefore, $C^{(t+1)}$ becomes the
succeeding community of $C^{(t)}$. In practice, if $C^{(t)}$ has no ancestor,
then communities satisfying the first condition become $C^{(t)}$'s successors
automatically.

We describe the whole algorithm in Algorithm 2.

Algorithm 2 CommunityEvolution($C_i^{(t)}$)

$Evol(C_i^{(t)}) = \{C_i^{(t)}\}$
$Core(C_i^{(t)})$ = CoreDetection($C_i^{(t)}$)
for every community $C_j^{(t+1)}$ in snapshot t+1 do
    $Core(C_j^{(t+1)})$ = CoreDetection($C_j^{(t+1)}$)
    if $Core(C_i^{(t)}) \cap Node(C_j^{(t+1)}) \neq \emptyset$ and
       $Node(C^{(t-m)}) \cap Core(C_j^{(t+1)}) \neq \emptyset$ for some ancestor
       $C^{(t-m)} \Rightarrow C_i^{(t)}$ then
        establish the relationship $C_i^{(t)} \to C_j^{(t+1)}$
        $Evol(C_j^{(t+1)})$ = CommunityEvolution($C_j^{(t+1)}$)
        $Evol(C_i^{(t)}) = Evol(C_i^{(t)}) \cup Evol(C_j^{(t+1)})$
    end if
end for
return $Evol(C_i^{(t)})$
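
The two successor conditions can be sketched in Python as follows. This is a
simplified, iterative illustration rather than the recursive procedure of
Algorithm 2. It assumes that the core set of every community has already been
computed, that communities are identified by a (snapshot index, position) pair,
that direct predecessors count as ancestors, and that a community without any
ancestor only needs to satisfy the first condition. The names `is_successor`
and `track` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

CommId = Tuple[int, int]  # (snapshot index, position inside the snapshot)


def is_successor(
    core_cur: Set[str],
    nodes_next: Set[str],
    core_next: Set[str],
    ancestor_node_sets: List[Set[str]],
) -> bool:
    """Check conditions (1) and (2) for a candidate successor."""
    if not (core_cur & nodes_next):
        return False  # condition (1) fails
    if not ancestor_node_sets:
        return True  # no ancestor: condition (1) is sufficient
    return any(nodes & core_next for nodes in ancestor_node_sets)


def track(
    nodes: List[List[Set[str]]], cores: List[List[Set[str]]]
) -> List[Tuple[CommId, CommId]]:
    """Return all successor links between consecutive snapshots.

    nodes[t][i] and cores[t][i] are the node set and the core node set of
    community i in snapshot t.
    """
    links: List[Tuple[CommId, CommId]] = []
    ancestors: Dict[CommId, Set[CommId]] = {}
    for t in range(len(nodes) - 1):
        for i, core_cur in enumerate(cores[t]):
            anc = ancestors.get((t, i), set())
            anc_nodes = [nodes[s][k] for s, k in anc]
            for j, nodes_next in enumerate(nodes[t + 1]):
                if is_successor(core_cur, nodes_next, cores[t + 1][j], anc_nodes):
                    links.append(((t, i), (t + 1, j)))
                    # The successor inherits the ancestors of its predecessor.
                    ancestors.setdefault((t + 1, j), set()).update(anc | {(t, i)})
    return links
```

In a small example, a community whose core node moves into a later community
is linked to it only if that later community's own core was already present in
the earlier history of the trace.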

From the perspective of successors and predecessors, we provide a very
straightforward way to identify community split, community mergence, community
birth and community death. Note that these are four phenomena that occur within
a single evolution trace.

— Community Split: a community has more than one successor. 

— Community Mergence: a community owns more than one predecessor. 

— Community Birth: a community has no predecessor. 



— Community Death: a community has no successor. 
Fig. 5 shows a typical example of community evolution.
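
The four rules above translate directly into a lookup over successor links. The
sketch below assumes the links are available as (predecessor, successor) pairs;
the function name `classify_events` and the label strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def classify_events(
    communities: Iterable[Hashable],
    links: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Set[str]]:
    """Label each community with the events it takes part in."""
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for pred, succ in links:
        successors[pred].add(succ)
        predecessors[succ].add(pred)
    events: Dict[Hashable, Set[str]] = {}
    for c in communities:
        labels: Set[str] = set()
        if len(successors[c]) > 1:
            labels.add("split")
        if len(predecessors[c]) > 1:
            labels.add("mergence")
        if not predecessors[c]:
            labels.add("birth")
        if not successors[c]:
            labels.add("death")
        events[c] = labels
    return events


# A splits into B and C, which later merge into D.
events = classify_events("ABCD", [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")])
```

Under these rules every community of the first snapshot is labelled as a birth
and every community of the last snapshot as a death, so boundary snapshots may
need separate handling in practice.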




Fig. 5. Community evolution illustration. Red square points are core nodes.
[Figure annotations: the size of $C^{(t)}$ is 5; the age of $C^{(t)}$ is 1 and
that of $C^{(t+2)}$ is 3; the evolution trace span is 4.]



6 Two representative phenomena in the social network. 

In [7], Palla has performed two experiments, only on the cond-mat co-authorship
and call networks: one finds the correlation between community size and age;
the other uncovers the correlation between evolution trace span and member
stability. That paper concludes that communities of larger size live longer and
that the longer an evolution trace span is, the lower its member stability. We
are interested in whether the two phenomena hold in other social networks and
in nonsocial ones. The results are shown in Fig. 6 (a) and (b).

Firstly, with CommTracker we discover phenomena similar to those reported in
Palla's paper, which supports the effectiveness and correctness of our method.
Secondly, all 6 social networks display the two common behaviors discussed
above. On the contrary, nonsocial networks do not show such behaviors. In
nonsocial networks, it seems that the size of a community cannot reflect its
age and that a community with higher stability lives longer.

We calculate the correlation coefficient between community size and age
(GROWTH) as well as that between evolution trace span and member stability
(METABOLISM) (see Table 1). In the 1st experiment, the values of social
networks are positive while those of nonsocial ones are nearly all negative.
In the 2nd experiment, the values of social networks are negative whereas
those of nonsocial ones are all positive. The two experiments reveal that we
can differentiate social networks from nonsocial ones according to GROWTH and
METABOLISM.
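
As a small illustration, the values of Table 1 can be checked against a simple
sign rule. The combined rule used here (positive GROWTH and negative
METABOLISM) is an assumption made for the sketch; the paper only states that
the two coefficients separate the two kinds of network. The names `TABLE_1` and
`looks_social` are hypothetical.

```python
from typing import Dict, Tuple

# (GROWTH, METABOLISM) per dataset, copied from Table 1.
TABLE_1: Dict[str, Tuple[float, float]] = {
    "cond-mat": (0.67, -0.76),
    "math": (0.45, -0.72),
    "nonlinear": (0.76, -0.62),
    "call": (0.2, -0.76),
    "email": (0.39, -0.67),
    "movie actor": (0.31, -0.37),
    "Internet": (0.29, 0.25),
    "vocabulary": (-0.07, 0.25),
    "ant": (-0.02, 0.23),
    "tomcat4": (-0.09, 0.47),
    "tomcat5": (-0.23, 0.51),
    "random": (-0.01, 0.16),
}


def looks_social(growth: float, metabolism: float) -> bool:
    """Assumed rule: social networks show GROWTH > 0 and METABOLISM < 0."""
    return growth > 0 and metabolism < 0


predicted_social = sorted(
    name for name, (g, m) in TABLE_1.items() if looks_social(g, m)
)
```

With this rule the Internet network, whose GROWTH value is positive, is still
classified as nonsocial because its METABOLISM value is positive.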

One important reason for such distinctions is that in social networks a
community represents a group of persons with close connections, while in
nonsocial ones a community is just a cluster of objects. In social networks, if
a community wants to have a long life, it must undergo suitable member changes,
that is, when some old core members retire, new ones take over responsibility
in time so that the development of the community is well supported. Otherwise,
if a community refuses to absorb new members, then when the old core members
exit from the community, new core members may not yet have been cultivated,
leading to quick disintegration. In contrast, the members of nonsocial networks
are objects, not persons. For example, in a software network, a community is a
cluster of classes with similar functions. If a class cluster is designed well,
it experiences little change and is used for a long time.

Fig. 6. (a) The correlation between community size (X axis) and community age
(Y axis). (b) The correlation between evolution trace span (X axis) and member
stability (Y axis).

Table 1. GROWTH and METABOLISM.

No.   Network        GROWTH   METABOLISM
1     cond-mat        0.67     -0.76
2     math            0.45     -0.72
3     nonlinear       0.76     -0.62
4     call            0.2      -0.76
5     email           0.39     -0.67
6     movie actor     0.31     -0.37
7     Internet        0.29      0.25
8     vocabulary     -0.07      0.25
9     ant            -0.02      0.23
10    tomcat4        -0.09      0.47
11    tomcat5        -0.23      0.51
12    random         -0.01      0.16

7 Social network model. 

Nowadays, many social network models have been established. However, when we
take snapshots generated from these models, most of them fail to display the
characteristic behaviors described above. In our view, a main defect is that a
node permanently exists in the network once it has been added. However, the
experiments in Section 4 show that a lot of nodes enter the network and then
quickly exit from it. Hence, how to revise existing models to make them more
reasonable is a problem to be solved.

7.1 Model introduction. 

Our model is based on Emily's model [21], which takes many aspects of social
networks into account, such as the meeting rate between pairs of individuals,
the decay of friendships, etc. Moreover, Emily's model indeed presents many
static features of social networks. Therefore, we adopt it as the basis of our
model. Our model can be simulated directly using the following algorithm.

Let $n_p = \frac{1}{2} N(N - 1)$, where N is the initial scale of the network.
Let $n_e = \frac{1}{2} \sum_i z_i$, where $z_i$ is the degree of the i-th
vertex. And let $n_m = \frac{1}{2} \sum_i z_i (z_i - 1)$.

1. We choose $n_p r_0$ pairs of vertices uniformly at random from the network
to meet. If a pair meet who do not have a pre-existing connection, and if
neither of them already has the maximum $z^*$ connections, then a new
connection is established between them.

2. We choose $n_m r_1$ vertices at random, with probability proportional to
$z_i (z_i - 1)$. For each vertex chosen we randomly choose one pair of its
neighbors to meet, and establish a new connection between them if they do not
have a pre-existing connection and if neither of them already has the maximum
number $z^*$ of connections.

3. We choose $n_e \gamma$ vertices with probability proportional to $z_i$. For
each vertex chosen we choose one of its neighbors uniformly at random and
delete the connection to that neighbor.

4. We choose one vertex; if its degree $z_i > \bar{z}$, the average degree, we
delete it with probability $\alpha$; otherwise, we delete it with probability
$\beta$. The process does not stop until $k_d$ vertices have been deleted.

5. We add $k_a$ new vertices. Each new vertex establishes a link with a
randomly chosen vertex v and then also connects to the vertex with the highest
degree among the neighbors of v.

Note that the first 3 steps already exist in Emily's algorithm, while the last
2 steps are added by us. The 4th step is responsible for deleting some existing
vertices according to their degrees. The last step focuses on adding new
vertices. In this step, we remove the limit on the maximum number of
connections in order to allow some vertices to get high degree. In reality, a
community consists of vertices with distinct degrees, while in Emily's social
network model a community tends to be a clique due to the limit on the maximum
number of connections.

As pointed out in [21], the network is initialized by starting with no edges
and running the first two steps without the other three until all or most
vertices have degree $z^*$ (we set the threshold at 85%). Then all five steps
are used for the remainder of the simulation.
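
The two added steps (vertex deletion and vertex addition) can be sketched as
follows. This is a minimal illustration under several assumptions that the text
leaves open: the candidate vertex in step 4 is drawn uniformly at random, the
average degree is recomputed after every deletion, ties for the highest-degree
neighbor are broken by vertex id, and the graph is stored as a dictionary of
neighbor sets. The function names are hypothetical.

```python
import random
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # vertex id -> set of neighbor ids


def delete_vertices(g: Graph, k_d: int, alpha: float, beta: float,
                    rng: random.Random) -> None:
    """Step 4: delete k_d vertices, with probability alpha for vertices
    above the average degree and beta otherwise."""
    if min(alpha, beta) <= 0:
        raise ValueError("alpha and beta must be positive to guarantee termination")
    deleted = 0
    while deleted < k_d and g:
        average = sum(len(nb) for nb in g.values()) / len(g)
        v = rng.choice(sorted(g))
        probability = alpha if len(g[v]) > average else beta
        if rng.random() < probability:
            for u in g.pop(v):
                g[u].discard(v)
            deleted += 1


def add_vertices(g: Graph, k_a: int, rng: random.Random) -> None:
    """Step 5: each new vertex links to a random vertex v and to the
    highest-degree neighbor of v, with no cap on the degree."""
    next_id = max(g, default=-1) + 1
    for _ in range(k_a):
        new, next_id = next_id, next_id + 1
        if not g:
            g[new] = set()
            continue
        v = rng.choice(sorted(g))
        hub = max(sorted(g[v]), key=lambda u: len(g[u]), default=None)
        g[new] = {v}
        g[v].add(new)
        if hub is not None:
            g[new].add(hub)
            g[hub].add(new)


rng = random.Random(0)
ring: Graph = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
delete_vertices(ring, k_d=3, alpha=0.5, beta=0.5, rng=rng)
size_after_delete = len(ring)
add_vertices(ring, k_a=2, rng=rng)
```

Passing an explicit `random.Random` instance keeps a simulation run
reproducible, which helps when comparing parameter settings.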



7.2 Model simulation.

Six experiments have been performed with different parameters $\alpha$,
$\beta$, $k_a$ and $k_d$, as shown in Fig. 7. In all simulations, $z^* = 5$,
$N = 250$, $r_0 = 0.0005$, $r_1 = 2$, $\gamma = 0.005$. When all five steps are
running, we take a snapshot every five repetitions. We consider 17 snapshots
together.



[Figure 7 panels: for each parameter setting, panel (1) reports GROWTH (X axis:
size) and panel (2) reports METABOLISM (X axis: evolution trace span).]

Fig. 7. Model simulation. (a) $\alpha = 0.8$, $\beta = 0.6$, $k_a = k_d = 3$
(b) $\alpha = 0.5$, $\beta = 0.5$, $k_a = k_d = 3$
(c) $\alpha = 0.3$, $\beta = 0.8$, $k_a = k_d = 3$
(d) $\alpha = 0.8$, $\beta = 0.3$, $k_a = k_d = 3$
(e) $\alpha = 0.5$, $\beta = 0.5$, $k_a = k_d = 6$
(f) $\alpha = 0.5$, $\beta = 0.5$, $k_a = 5$, $k_d = 3$



8 Conclusions. 



In this paper, we first perform some basic experiments to explore the dynamic
characteristics of social networks. We discover that a large percentage of
nodes are so unstable that we cannot rely on them too much, and that nodes with
higher degree appear more frequently during the evolution of a social network.
Based on these experimental results, we propose a novel core-based algorithm to
track community evolution, which has the following features: (1) it is
effective; (2) it is parameter-free; (3) it is suitable for discovering split
and mergence points. With the algorithm, we uncover two representative dynamic
features of social networks and define two coefficients, GROWTH and METABOLISM,
by which we also achieve the goal of telling social networks from nonsocial
ones. Finally, we propose a revised social network model which can display the
two typical characteristics. The experiments are based on 6 social networks
(co-authorship networks, call network, movie actor network and email network)
and 5 nonsocial networks (Internet, vocabulary network and software networks).